Introduction to GPU Radix Sort

Authors

  • Takahiro Harada
  • Lee Howes
Abstract

The prefix sum at a location is the sum of all values at preceding locations in the sequence, in this case those to the left of the current location. In radix sort the prefix sum is taken over the per-value counts, so it gives the total number of elements whose value is less than the current value. For example, the prefix sum at location 2, and hence for value 2, is o2 = 3: the sequence contains 3 entries that are 0s or 1s, so 3 is the destination address of the first 2 in the data set. In general, the destination address of an element is the offset computed by the prefix sum plus the element's index among the elements of the same value in the original array: the second 2 in the array goes to location 3 + 1 = 4. Shuffling the elements to their destination addresses yields the sorted array.
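The addressing scheme described in the abstract can be sketched sequentially as follows. This is a minimal CPU illustration, not the authors' GPU implementation: the function names `exclusive_prefix_sum` and `counting_scatter` and the sample keys are assumptions chosen so that the offset for value 2 comes out as 3, matching the example in the text.

```python
from itertools import accumulate


def exclusive_prefix_sum(counts):
    """Return offsets where offsets[v] is the sum of counts[0..v-1]."""
    # accumulate(..., initial=0) yields 0, c0, c0+c1, ...; drop the final total.
    return list(accumulate(counts, initial=0))[:-1]


def counting_scatter(keys, num_values):
    """Sort small integer keys by histogram, prefix sum, and scatter.

    Hypothetical helper: keys must lie in range(num_values).
    """
    # 1. Histogram: how many times each value occurs.
    counts = [0] * num_values
    for k in keys:
        counts[k] += 1

    # 2. Prefix sum: number of elements with a smaller value.
    offsets = exclusive_prefix_sum(counts)

    # 3. Scatter: destination = offset + index among equal values.
    seen = [0] * num_values
    out = [None] * len(keys)
    for k in keys:
        out[offsets[k] + seen[k]] = k
        seen[k] += 1
    return offsets, out


# Assumed sample data: three entries are 0s or 1s, so the first 2 lands at 3.
offsets, out = counting_scatter([2, 0, 1, 2, 0, 3], num_values=4)
print(offsets)  # [0, 2, 3, 5]
print(out)      # [0, 0, 1, 2, 2, 3]
```

In this sketch the second 2 is written to location 3 + 1 = 4. A GPU version would typically compute the histogram and the prefix sum in parallel, but the destination address is formed the same way.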

Similar resources

GRS - GPU radix sort for multifield records

We extend the number sorting algorithms on the GPU to sort large multi-field records. We notice that the traditional way of sorting the records, by first sorting (key, index) pairs to obtain the sorted permutation of the records and then rearranging the entire records to their final positions, might not be the most efficient way to sort them, depending on the type of sorting algor...

Sorting On A Graphics Processing Unit(GPU)

2.1 Graphics Processing Units
2.2 Sorting Numbers on GPUs
2.2.1 SDK Radix Sort Algorithm
2.2.1.1 Step 1–Sorting tiles ...

Fast parallel GPU-sorting using a hybrid algorithm

This paper presents an algorithm for fast sorting of large lists using modern GPUs. The method achieves high speed by efficiently utilizing the parallelism of the GPU throughout the whole algorithm. Initially, GPU-based bucketsort or quicksort splits the list into enough sublists then to be sorted in parallel using merge-sort. The algorithm is of complexity n log n, and for lists of 8M elements...

Fast radix sort for sparse linear algebra on GPU

Fast sorting is an important step in many parallel algorithms, which require data ranking, ordering or partitioning. Parallel sorting is a widely researched subject, and many algorithms were developed in the past. In this paper, the focus is on implementing highly efficient sorting routines for the sparse linear algebra operations, such as parallel sparse matrix matrix multiplication, or factor...

Energy-Efficient Sorting on a Many-Core Platform

As processors move from multi-core to many-core architectures, opportunities arise for energy-efficient enterprise computations, such as sorting, on large arrays of processors. This paper proposes three different energy-efficient sorting methods for the first phase of an external sort, simulated on varying-sized fine-grained many-core processor arrays used as a co-processor to an Intel CPU, wh...

Publication date: 2011